| notebook.community

32.1-2

Design:
Algorithm($T$, $P$):
$n := T.length$
$m := P.length$
$i := 1$
$k := 1$
while $i+k-1 \leq n$:
if $T[i+k-1]$ != $P[k]$:
$i := i + max(k-1, 1)$
$k := 1$
else:
$k := k + 1$

if $k$ > $m$:
print( 'Pattern occurs with shift ' $i-1$ )
$i := i + m$
$k := 1$

Correctness:
Variable $i$ is an index variable for $T$, and $k$ is an index variable for $P$.
Observe that the index variable $i$ steps forward with increment $max(k-1, 1)$ for $k \leq m$. We may consider two cases:

$k-1 > 0$:
The subarray $T[i..i+k-2]$ is not empty, so by our hypothesis, $P[1..k-1]$ matches $T[i..i+k-2]$ implies that $T[j] \neq P[1]$ for all $i < j < i+k-1$. Hence, we can move index $i$ with value $k-1$, reset index $k$ and continue.
$k-1$ == $0$:
If this case occurs, then we must have $k$ == 1 $\implies$ $T[i] \neq P[1]$. So we can increase index $i$ by $1$, reset index $k$ and continue.

Since index $k$ denotes the progress of pattern matching. Therefore, if $k$ goes across $m$, it means we have checked through all elements in $P$ and no mismatch occurs, then a pattern in $T$ is found, and by our hypothesis, we can move index $i$ with value of pattern length $m$ to continue pattern matching process.

Analysis:
Consider the value of $i+k-1$. In the start of some iteration of the while-loop, we may concern three cases for value of $k$:

$k$ == $1$:
No doubt, $i+k-1$ will rise by $1$ for all possible result of the first conditional statement.
$2 \leq k < m$:
If the first conditional statement fails, then $i+k-1$ will rise by $1$. Otherwise, $n+k-1$ will stay on the same value and reset $k$ to $1$, then it guarantees that $i+k-1$ will be added by $1$ in the next iteration.
$k$ == $m$:
If the first conditional statement fails, then $i+k-1$ will rise by $1$, and the second conditional statement will reset $k$ to $1$ but keep $i+k-1$ same as the result of the first conditional statement.

By our discussion above, $i+k-1$ must be increased by at least $1$ in any $2$ adjacent iterations. Thus, the boundary condition $i+k-1 \leq n$ guarantees iteration times will never exceed $2n$. $\implies$ The time complexity is $O(n)$.

32.1-4

We may assume that:

Gap characters will never be the first or the last elements.
Gap characters will never be put adjacently.

Algorithm:
Gap-Matcher($T[1 ... n]$, $P[1 ... m]$):
let $g :=$ number of gap characters in $P$
partition $P$ into $g+1$ substrings $P_1, ... , P_{g+1}$ by gap characters. i.e. $P = P_1 \diamond P_2 \diamond ... \diamond P_g \diamond P_{g+1}$

$P_0$ := '' // empty string
$s := 1$
for $i$ from $1$ to $g+1$:
do KMP preprocessing for $P_i$
use KMP algorithm to find the first occurence of $P_i$ in $T[s+P_{i-1}.length ... n]$ and then assign the shift to $s$; otherwise, assign $s := 0$
if $s$ == $0$:
return False

return True

Analysis:
There are three non-constant time part:

Partition operation takes $O(m)$ time.
All KMP preprocessing parts take $\Theta(m)$ time in total, since $\Sigma_{k=1}^{g+1}{P_i.length} \leq m$ and KMP preprocessing in each iteration takes $\Theta(P_i.length)$ time.
KMP algorithm: Observe that, the ranges which KMP algorithm scanned in each iteration are not overlapped with each other.
Therefore, KMP algorithm parts in all iterations take $\Theta(n)$ time.

Hence, the time complexity is $O(m) + \Theta(m) + \Theta(n) = \Theta(m+n)$.

32.4-7

Algorithm:
Determine-Cyclic-Rotation($T[1 ... n]$, $T'[1 ... n]$):
let $T^2[1 ... 2n]$ be a new string
fill up $T^2$ with $2$ times of $T$
do KMP preprocessing for pattern $T'$
use KMP algorithm to find pattern $T'$ occurs in $T^2$ or not, and return the result

Correctness:
Our algorithm returns True $\iff$ there exists $m$ with $0 \leq s < n-1$ s.t.
$T^2[1+s ... n]$ == $T[1+s ... n]$ == $T'[1 ... n-s]$ and $T^2[n+1 ... n+s]$ == $T[1 ... s]$ == $T'[n-s+1 ... n]$.

Analysis:

Construction of $T^2$ takes $O(2n) = O(n)$ time.
Initialization of $T^2$ takes $O(2n) = O(n)$ time.
KMP preprocessing for pattern $T'$ takes $\Theta(n)$ time.
KMP algorithm takes $\Theta(2n) = \Theta(n)$ time.

Hence, our algorithm takes $O(n) + O(n) + \Theta(n) + \Theta(n) = \Theta(n)$ time.

21.2-2

[$x_1$, $x_2$, $x_3$, $x_4$, $x_5$, $x_6$, $x_7$, $x_8$, $x_9$, $x_{10}$, $x_{11}$, $x_{12}$, $x_{13}$, $x_{14}$, $x_{15}$, $x_{16}$]
FIND-SET($x_2$) : $x_1$
FIND-SET($x_9$) : $x_1$

21.3-3

Let $x_1$, ... , $x_n$ be $n$ distinct elements and $h = \lfloor \lg n \rfloor$.

for $i$ from $1$ to $n$:
MAKE-SET($x_i$)
for $j$ from $1$ to $h$:
for $k$ from $1$ to $2^h-2^j+1$ by $2^j$:
UNION($x_{k + 2^{j-1}}$, $x_k$)
for $i$ from $1$ to $n + 2^h + 1$:
FIND-SET($x_{2^h}$)

$\because$ The nested for-loop unites $n$ sets $[x_1]$, ... , $[x_{2^h}]$ to one set $[x_1, ... , x_{2^h}]$.
$\therefore$ We take $2^h - 1$ UNION operations in total.
Put $m = n + (2^h-1) + (n + 2^h + 1) = 2n + 2 \cdot 2^{h}$.
Then we have a sequence of $m$ operations which consists with $n$ MAKE-SET, $\lfloor \lg n \rfloor - 1$ UNION and $ n + \lfloor \lg n \rfloor + 1$ FIND-SET operations.

Next, let's consider the outer for-loop of the nested for-loop. Observe that we unite all pairs of set, ($x_{k + 2^{j-1}}$, $x_k$), it's with same the rook rank, at the end of each iteration. Since we put the root of second set upon the root of first set during UNION operation of the same root rank, therefore FIND-SET($x_{2^h}$) will go through $x_{2^{h-1}}$, ... , $x_{2^1}$, $x_{2^0}$. $\implies$ It takes $\Omega(h) = \Omega(\lfloor \lg n \rfloor)$ time.

Hence our procedures takes $\Omega((n + \lfloor \lg n \rfloor + 1)\lfloor \lg n \rfloor) \geq \Omega(\frac{m}{2} \lg n) = \Omega(m \lg n)$.